A dictionary based post-processing technique for an on-line
handwriting recognition system is described. All punctuation
is removed from an input word, and the word is checked against
a word processing dictionary. If the word matches an entry in
the dictionary, it is verified as a valid word. If it does not
verify, a stroke match function and a spell-aid dictionary are
used to construct a list of possible words. In some cases,
possible words obtained by changing the first character of the
originally recognized word are appended to the list. A
character-match score, a substitution score and a word-length
score are assigned to the items on the list. A word hypothesis
is constructed from the list, with each such word being
assigned a score. The word with the best score is chosen as
the output word for the processor.
FIELD OF THE INVENTION

The invention is in the field of handwriting 
recognition, and specifically is directed to post- 
processing error correction. In particular, the error 
correction is accomplished using a dictionary. 

BACKGROUND OF THE INVENTION 

Because letters such as "v" and "u"; "k" and "h"; "1", "I",
and "l"; and so on, have similar shapes, any on-line
recognition of handwritten letters cannot avoid producing
errors. According to the present invention, these errors, and
errors from other sources, are corrected using a
dictionary-driven error correction post-processing technique
for handwriting recognition.

Various techniques have been utilized in char- 
acter recognition systems and the like which in- 
clude dictionaries, but none have been found utiliz- 
ing the techniques found in this invention. 

U.S. Patent 4,653,107 to Shojima et al dis- 
closes a system in which coordinates of a 
"handwritten" pattern drawn on a tablet are se- 
quentially sampled by a pattern "recognition" unit 
to prepare pattern coordinate data. Based on an 
area encircled by segments created by the sam- 
pled pattern coordinate data of one stroke and a 
line connecting a start point and an end point of 
the one-stroke coordinate data, the sampled pattern 
coordinate data of the one stroke is converted to a 
straight line and/or curved line segments. The con- 
verted segments are quantized and normalized. 
The segments of the normalized input pattern are 
rearranged so that the input pattern is drawn in a 
predetermined sequence. Differences between direction
angles for the rearranged segments are calculated.
Those differences are compared with differences of
the direction angles of the "dictionary" patterns read
from a memory to calculate a difference therebetween.
The matching of
the input pattern and the "dictionary" pattern is 
determined in accordance with the difference. If the 
matching fails, the first or last inputted segment of 
the input pattern is deleted or the sampled pattern 
coordinate data of the next stroke is added, to 
continue the "recognition" process. 

U.S. Patent 5,034,991 to Hagimae et al dis- 
closes a character "recognition" method and sys- 
tem in which a character indicated in a printed, 
stamped, carved or other form is two-dimensionally 
imaged and stored as image data and the stored 
image data is subjected to an image processing to 
"recognize" the character. The "recognition" of the 
character is performed in such a manner that each time the
comparison of plural kinds of feature vectors extracted from
the character to be "recognized" and a "dictionary" vector of
each candidate character in a group of candidate characters
preliminarily prepared is made for one of the plural kinds of
feature vectors, a candidate character having its "dictionary"
vector away from the extracted feature vector by a distance
not smaller than a predetermined value is excluded from the
candidate character group. The "dictionary" vector for each
candidate character is defined as an average vector for a
variety of fonts. A difference between the "dictionary" vector
and the feature vector extracted from the character to be
"recognized" is estimated by virtue of a deviation vector for
the variety of fonts to produce an estimated value. The
exclusion from the candidate character group is judged on the
basis of the estimated values each of which is cumulatively
produced each time the estimation for the difference is made.

U.S. Patent 5,020,117 to Ooi et al discloses a system in which
"recognition" character candidates and their similarities for
each character obtained by a character "recognition" section
from an input character string are stored in a first
"recognition" result memory, and "recognition" character
candidates obtained by rotating the corresponding characters
through 180 degrees and their similarities are stored in a
second "recognition" result memory. Address pointers for
accessing the first and second "recognition" result memories
are stored in an address pointer memory. The first
"recognition" result memory is accessed in accordance with the
address pointers read out from the address pointer memory in
an ascending order, and the second "recognition" result memory
is accessed in accordance with the address pointers read out
from the address pointer memory in a descending order.
Coincidences between "recognition" candidates read out from
the first and second "recognition" result memories and
character strings of "dictionary" words read out from a
"dictionary" memory are computed by a coincidence computing
section. A "recognition" result of the input character string
is obtained based on the coincidence.

U.S. Patent 5,010,579 to Yoshida et al discloses a
hand-written, on-line character "recognition" apparatus, and
the method employed by it, in which the structure of a
"dictionary" for "recognition" is formed as a sub-routine
type, whereby the "dictionary" can be made small in size and
the time necessary for "recognition" can be reduced.

In commonly assigned U.S. Patent 5,029,223, July 2, 1991,
Fujisaki discloses a method and apparatus for identifying a
valid symbol or a string of valid symbols from a sequence of
handwritten strokes. A method includes the steps of (a)
generating in response to one or more handwritten strokes a
plurality of stroke labels each having an associated score;
(b) processing the plurality of stroke labels in accordance
with a beam search-like technique to identify those stroke
labels indicative of a valid symbol or portion of a valid
symbol; and (c) associating together identified stroke labels
to determine an identity of a valid symbol or a string of
valid symbols therefrom. An aspect of the invention is that
each of the constraint validation filters is switchably
coupled into a serial filter chain. The switches function to
either couple a filter input to a stroke label or decouple the
input and provide a path around the filter block. An
application writer has available a plurality of constraint
filters. The application writer specifies which one or ones of
the constraint filters are to be applied for a specific
sequence of strokes. Fujisaki is incorporated herein by
reference.

As stated above, the present invention utilizes a dictionary
for post-processing error correction in an on-line handwriting
recognition system. The patents just discussed do not teach or
suggest the use of a dictionary for such a purpose.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a dictionary based post-processor
for an on-line handwriting recognition system;
FIG. 2 is a general block diagram of the dictionary
post-processor of FIG. 1;
FIGS. 3A and 3B, when taken together as shown in FIG. 3,
comprise a general flow chart of the dictionary post-processor
of FIG. 2;
FIGS. 4A-4F, when taken together as shown in FIG. 4, comprise
a detailed flow chart of the dictionary post-processor of
FIG. 3;
FIG. 5 is a flow chart of the stroke match block 34 of FIGS. 2
and 3; and
FIG. 6 is a flow chart of the spell-aid block 36 of FIGS. 2
and 3.

DISCLOSURE OF THE INVENTION 

A dictionary based post-processing technique is disclosed for
an on-line handwriting recognition system. All punctuation is
removed from an input word, and the word is checked against a
word processing dictionary. If the word matches an entry in
the dictionary, it is verified as a valid word. If it does not
verify, a stroke match function and a spell-aid dictionary are
used to construct a list of possible words. In some cases,
possible words obtained by changing the first character of the
originally recognized word are appended to the list. A
character-match score, a substitution score and a word-length
score are assigned to the items on the list. A word hypothesis
is constructed from the list, with each such word being
assigned a score. The word with the best score is chosen as
the output word for the processor.

BEST MODE OF CARRYING OUT THE INVENTION

Referring to FIG. 1, there is shown in block diagram form a
character recognition system 10 that includes a segmentation
processor 12 coupled between an electronic tablet 14 and a
character recognizer 18. Tablet 14 can be any of a number of
suitable commercially available electronic tablets. The tablet
14 has an associated stylus or pen 15 with which, in a
pen-down position, a user forms symbols, such as block
printing or script alphanumeric characters, on a surface of
the tablet 14. The tablet 14 has x-axis and y-axis output
signals expressive of the position of the pen 15 on an x-y
tablet coordinate system. A stroke capture means 16 may be a
software task which intercepts the x-y outputs from the tablet
to generate x-y position pair data for the segmentation
processor 12. An output of the segmentation processor 12 is
data expressive of connected strokes and unconnected strokes
which is input to the character recognizer 18 of the
invention. The character recognizer 18 operates to determine
an identity of a connected group of segmented strokes and has
an output 18a expressive of identified symbols such as
alphanumeric characters.

In this regard it should be realized that the invention is
applicable to the recognition of a number of hand-drawn
symbols wherein a given symbol is composed of at least one
segmented stroke. By employing the teaching of the invention
the system 10 readily recognizes symbols associated with
written characters of various languages and also mathematical
and other types of symbols.

The output of character recognizer 18 on line 18a is provided
to search block 19 which provides a top answer on line 19a and
a cache of best matched strokes on line 19b. A more detailed
description of blocks 14-19 can be found in Fujisaki, U.S.
5,029,223, which has been incorporated herein by reference.
Dictionary post-processing is then accomplished in
post-processing block 20 which compares the top answer words
on line 19a with words in a dictionary 22 to produce an output
word on line 23.

The top answer is the result of a search for the best
candidate for a recognized word, and is the input word to the
post-processor 20. The cache of best matched strokes is the
result of a search for the best strokes to form a word.

Refer now to FIG. 2, which is a block diagram of the
dictionary post-processor 20 of FIG. 1. A punctuation filter
24 receives the top word on line 19a and the cache of
different paths signal on line 19b and removes all punctuation
from the word. At a verification block 26, the recognized
sequence of characters is matched against the dictionary in
block 28 to see if any word exists with that spelling. This
match is case insensitive. If there is a match, the word is
provided to the unify-case block 30 and an output word is
provided on line 31. If on the other hand there is no
verification, a character match score is computed at block 32,
and if the score per character is at a predetermined level,
the word is output to block 30, and an output word is provided
on line 31. Otherwise, a stroke match is computed at block 34.
Along the search process, the top match hypotheses are kept in
a cache along with their scores. Basically, the objective of
this module is to generate all the possible words from the
strokes in this cache and to calculate the total matching
score of each of these words. The word hypotheses with the
best matching scores are then inserted into a global word
hypotheses list.

The output of the stroke match 34 is then provided to a
spell-aid block 36, which takes the recognition output, tries
to find words in the dictionary which resemble this sequence
of characters, and inserts them in the global word hypotheses
list. This module's output is very dependent on the initial
characters. Therefore, most of the time, the first character
is retained. However, if the match score of the first
character is worse than that of the other characters in the
word, then it is replaced in block 38. At block 40, three
types of scores are assigned to the hypotheses, and at block
42, the best hypothesis in the list is determined. At block
44, the best hypothesis is used as the final word, and
punctuation is reinserted at block 46 with an output word
being provided on line 48.
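
By way of illustration only, the punctuation filter of block 24 and the reinsertion step of block 46 might be sketched as follows. The sketch assumes that punctuation occurs only at the beginning and end of the word; the function names `strip_punctuation` and `reattach_punctuation` are hypothetical and are not taken from the disclosure.

```python
import string


def strip_punctuation(word):
    """Split a word into (leading punctuation, core, trailing punctuation)."""
    start = 0
    while start < len(word) and word[start] in string.punctuation:
        start += 1
    end = len(word)
    while end > start and word[end - 1] in string.punctuation:
        end -= 1
    return word[:start], word[start:end], word[end:]


def reattach_punctuation(core, leading, trailing):
    """Put the stored punctuation back around the (possibly corrected) core."""
    return leading + core + trailing


leading, core, trailing = strip_punctuation('"hello,')
# Error correction would operate on `core` here; it is passed through unchanged.
print(reattach_punctuation(core, leading, trailing))
```

Storing the removed punctuation separately lets the dictionary lookup operate on the bare word while the output word keeps the punctuation the user wrote.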

Refer now to FIG. 3, which is a more detailed block diagram of
the dictionary post-processor 20. In FIG. 3A, the top answer
is provided on input line 19a and the cache of best matched
strokes signal is provided on line 19b, and the punctuation is
removed from the top answer at block 24. At decision block 26,
a determination is made whether the word is verified in the
dictionary. If so, the word is provided to the unify-case
block 30 and an output word is provided on line 31. If, on the
other hand, the word does not exist in the dictionary, proceed
to character match score block 32, which is comprised of
blocks 50 and 52. In block 50, scores are calculated for each
character in the top answer word. At decision block 52, a
determination is made whether or not the worst character score
is better than a predetermined threshold. If so, proceed to
block 30 to unify the case and provide an output word on line
31. If not, proceed to block 34 where a stroke match is made
using the cache to find all combinations of strokes which will
verify. At spell-aid block 36, a standard word processor
dictionary is used to get some suggested words.
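
The two early exits of FIG. 3A (decision blocks 26 and 52) might be sketched as follows, by way of illustration only. The sketch assumes that the dictionary is held as a set of lower-case words and that raw character scores are distance-like, so that a lower score is better; that direction is inferred from the test at block 76 and is an assumption. The names `is_verified` and `worst_score_ok` are hypothetical.

```python
def is_verified(word, dictionary):
    """Block 26: case-insensitive dictionary check.

    A word made up only of non-alphabetic characters is treated as verified.
    """
    if not any(ch.isalpha() for ch in word):
        return True
    return word.lower() in dictionary


def worst_score_ok(char_scores, threshold):
    """Block 52: accept the word when even its worst character is good enough.

    Assumes lower raw scores are better, so the worst score is the maximum.
    """
    return max(char_scores) < threshold


dictionary = {"hello", "world"}
print(is_verified("Hello", dictionary))       # in the dictionary, ignoring case
print(worst_score_ok([1.0, 5.0], 3.0))        # one character exceeds the threshold
```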

Proceed next to first character replacement block 38 in FIG.
3B, which is comprised of blocks 54 and 56. In decision block
54, a determination is made whether or not the first character
in the top answer has the worst character match score among
all the characters in the top answer. If so, proceed to block
56 and get a hypothesis by changing the first character using
statistics of first characters in a word. Proceed then to
block 40 and assign a character-match score to the top answer,
and substitution scores and word-length scores to all
hypotheses. In block 42, find the hypothesis with the best of
all relative scores based on the following precedence: 1. word
length; 2. substitution; 3. relative character-match. After
this determination, proceed to block 44 and unify the case
with the hypothesized word, which is then provided to block 46
where punctuation is reinserted, and an output word is
provided on line 48.

Refer now to FIG. 4, which is a detailed flow chart of the
operation of the post-processor block 20. The flow chart
starts at block 60 of FIG. 4A, and at block 62, a sequence of
characters is extracted from the top search path and stored in
"word" and "original word". At block 64, all punctuation is
removed from the "word" and the punctuation is stored. At
block 66, a case insensitive match of "word" is made against
the dictionary database. If "word" is made up of
non-alphabetical characters only, then it is verified. If it
has any special characters such as punctuation marks, they are
separated and kept in a separate portion called "punctuation".
At decision block 68, a determination is made if the "word" is
verified. If so, proceed to block 70 and, if the first
character of the "original word" is upper case, retain its
case. Count the number of lower and upper case characters and
convert all character cases to the majority case in the
"original word". The "original word" is then provided as an
output word on line 72.

If in decision block 68 the word is not verified, go to block
74 of FIG. 4B where the shape matching score for each
character in the word is looked up. At decision block 76, it
is determined if the sum of scores is less than a threshold
times the length of the "word". If so, return to block 70
(FIG. 4A) and generate an output word on line 72. If the
determination is that the sum of scores is not less than the
threshold times the length of the word, proceed to block 78
where a linear transformation of the scores is performed such
that the highest score is mapped to zero and zero is mapped to
the highest score. Call the new scores "character-match"
scores. For those characters in the "word" which have no match
scores associated with them, assign a character-match score of
"-1", which is worse than all other scores. At block 80, for
the word hypotheses given by stroke-match and spell-aid, if
their characters have a match score associated, then transform
those match scores using the above linear transformation and
assign these scores to those characters as the character-match
score. If no match score is available, then use "0" as their
character-match score. At block 82, get a list of suggested
words from the stroke match, and proceed to block 84 and
append to this list a list of words suggested by the
spell-aid.
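
The test of block 76 and the linear transformation of block 78 might be sketched as follows, by way of illustration only. The sketch assumes that the transformation is s' = s_max - s, where s_max is the highest raw score among the characters of the word itself; the disclosure does not say whether the highest score is taken per word or over all possible scores. A missing score is represented by `None`, and the function names are hypothetical.

```python
def accept_by_average(raw_scores, threshold):
    """Block 76: accept when the sum of scores is below threshold * length."""
    return sum(raw_scores) < threshold * len(raw_scores)


def character_match_scores(raw_scores):
    """Block 78: map the highest raw score to 0 and a raw score of 0 to the highest.

    Characters with no raw score (None) receive -1, which is below every
    transformed score and therefore worse than all of them.
    """
    known = [s for s in raw_scores if s is not None]
    highest = max(known) if known else 0.0
    return [-1 if s is None else highest - s for s in raw_scores]


print(character_match_scores([0.0, 3.0, None, 1.0]))  # [3.0, 0.0, -1, 2.0]
```

After the transformation a larger character-match score is better, which is why "-1" can serve as the worst possible value.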

At block 86 of FIG. 4C, if the first character in "word" has
transformed score "0" or "-1", then try replacing the first
character with other characters given through a study of the
probability of characters occurring at the beginning of a
word in a predetermined word corpus, for example, a corpus of
320,000,000 words containing a predetermined number of
distinct words, for example, 270,000 such words. At block 88,
for each word hypothesis given by stroke-match and spell-aid,
record the number of substitutions that the hypothesis makes
compared to the original word. Call this the "substitution
score" (SS). Proceed then to decision block 90 for the
determination of whether or not this is a strong dictionary.
If not, proceed to block 92 of FIG. 4C where L equals the
length of the "original word" and SS equals the substitution
score. At decision block 94, a test is made to keep the error
correction robust.
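
The first character replacement of block 86 and the substitution score of block 88 might be sketched as follows, by way of illustration only. The sketch assumes that the first-letter statistics are supplied by the caller as a list of letters ordered from most to least probable, that only replacements forming dictionary words are kept, and that the substitution score is the number of differing positions over the shared length of the two words. None of these details is stated in the disclosure, and the letter list used in the example is invented.

```python
def first_character_hypotheses(word, first_letters, dictionary):
    """Block 86: replace the first character with each candidate letter.

    Only candidates that differ from `word` and appear in `dictionary` are kept.
    """
    hypotheses = []
    for letter in first_letters:
        candidate = letter + word[1:]
        if candidate != word and candidate in dictionary:
            hypotheses.append(candidate)
    return hypotheses


def substitution_score(hypothesis, original):
    """Block 88: count positions at which the hypothesis differs from the original."""
    return sum(a != b for a, b in zip(hypothesis.lower(), original.lower()))


# The letter ordering below is made up for the example, not corpus data.
print(first_character_hypotheses("kand", ["t", "h", "b"], {"hand", "band", "land"}))
print(substitution_score("hand", "hund"))
```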

If not robust, proceed to block 96 and set SS equal to -1 and
proceed to block 98 of FIG. 4D, and for every word
hypothesized by stroke-match and spell-aid, find the
difference between its length and that of "word", taking care
with the presence of punctuation. Call this the "word-length"
score. This same path is taken if there is a strong dictionary
decision at block 90. At block 100, loop through all word
hypotheses and their scores. At block 102, find all hypotheses
with the smallest word-length score. At block 104, find among
these the hypotheses with the smallest substitution scores.
Proceed to block 106 of FIG. 4E and among these word
hypotheses find the hypotheses which tend to make the most
substitutions at the positions of characters which had a
character-match score of "-1" in "word". At block 108, among
the remaining hypotheses, the hypothesis which has the
smallest sum of absolute values of the differences between its
character-match scores and those of "word" is kept. In block
110, if the remaining list of hypotheses has more than one
element, then keep that hypothesis which originated from the
stroke match module.
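
The successive narrowing of blocks 102 to 110 might be sketched as follows, by way of illustration only. The `Hypothesis` record, its field names and the source labels are hypothetical. The robustness test of blocks 90 to 96 is omitted because its criterion is not given in the text, and the list of hypotheses is assumed to be non-empty.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    word: str
    source: str        # "stroke" or "spell" (illustrative labels)
    char_scores: list  # transformed character-match score per character


def keep_best(items, key, best=min):
    """Keep every item whose key ties for the best value."""
    target = best(key(item) for item in items)
    return [item for item in items if key(item) == target]


def select(word, word_scores, hyps):
    def subs(h):
        return sum(a != b for a, b in zip(h.word, word))

    def subs_at_unknown(h):
        # Substitutions at positions where "word" had a score of -1.
        return sum(1 for a, b, s in zip(h.word, word, word_scores)
                   if s == -1 and a != b)

    def score_gap(h):
        return sum(abs(a - b) for a, b in zip(h.char_scores, word_scores))

    hyps = keep_best(hyps, lambda h: abs(len(h.word) - len(word)))  # block 102
    hyps = keep_best(hyps, subs)                                    # block 104
    hyps = keep_best(hyps, subs_at_unknown, best=max)               # block 106
    hyps = keep_best(hyps, score_gap)                               # block 108
    if len(hyps) > 1:                                               # block 110
        hyps = [h for h in hyps if h.source == "stroke"] or hyps
    return hyps[0]


candidates = [
    Hypothesis("hand", "spell", [0, 0, 0, 0]),
    Hypothesis("hind", "stroke", [3, 1, 2, 2]),
    Hypothesis("hound", "spell", [0, 0, 0, 0, 0]),
]
print(select("hund", [3, -1, 2, 2], candidates).word)  # hind
```

Each stage keeps all hypotheses that tie for the best value, so a later criterion is consulted only to break ties left by the earlier ones, which matches the stated precedence of word length, then substitution, then character-match.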

At decision block 112 of FIG. 4E, a determination is made if
the first character of the "original word" is upper case. If
not, go to block 114 and set the upper case equal to zero, and
proceed to block 118. If so, proceed to block 116 and set the
upper case equal to one. Proceed then to decision block 118
where a determination is made whether or not most of the
characters in the "original word" are lower case. If so,
proceed to block 120 and set the lower case equal to one and
then proceed to block 124. If in decision block 118 most of
the characters in the original word are not lower case,
proceed to block 122 and set lower case equal to zero. At
decision block 124, a determination is made if the lower case
equals zero. If not, proceed to block 126 and turn all
characters in the hypothesis into lower case. If lower case
equals zero in decision block 124, proceed to block 134 and
turn all characters in the hypothesis into upper case, and
then proceed to block 132 at FIG. 4F. At block 128 of FIG. 4F,
a determination is made if the upper case equals one. If not,
proceed to block 132. If so, proceed to block 130 and turn the
first character in the hypothesis into upper case. Proceed
then to block 132 and copy the hypothesis into the original
word. At block 136 punctuation is reinserted in the word and
an output word is provided on line 138.
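
The case handling of blocks 112 to 134 might be sketched as follows, by way of illustration only. The sketch assumes that "most" means a strict majority of the alphabetic characters, so that a tie is treated as not mostly lower case; the function name `restore_case` is hypothetical.

```python
def restore_case(hypothesis, original_word):
    """Apply the case pattern of the original word to the chosen hypothesis."""
    upper_first = original_word[:1].isupper()                  # blocks 112-116
    letters = [c for c in original_word if c.isalpha()]
    lower_count = sum(c.islower() for c in letters)
    mostly_lower = lower_count > len(letters) - lower_count    # blocks 118-122
    if not mostly_lower:
        return hypothesis.upper()                              # block 134
    result = hypothesis.lower()                                # block 126
    if upper_first:                                            # block 128
        result = result[:1].upper() + result[1:]               # block 130
    return result


print(restore_case("hello", "Helo"))  # Hello
print(restore_case("hello", "HELO"))  # HELLO
```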

Refer now to FIG. 5, which is a detailed flow chart of the
stroke match block 34 shown in FIGS. 2 and 3. The flow chart
begins at 140. At block 142, the top scoring stroke hypotheses
are taken from the stroke matcher and all combinations of
strokes are found which make valid words in the dictionary. At
block 144, the scores for the strokes in each word hypothesis
are added. At block 146, a list of "N" hypotheses with the
best total score is made and the list is returned at block
148.
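
The stroke match of FIG. 5 might be sketched as follows, by way of illustration only. The layout of the cache is not specified in the text; the sketch assumes one list of (character, score) candidates per character position and assumes that a lower total score is better. The function name `stroke_match` and the default value of `n` are hypothetical.

```python
from itertools import product


def stroke_match(cache, dictionary, n=5):
    """Return up to n dictionary words that can be formed from the cache.

    cache: one list of (character, score) candidates per character position.
    """
    scored = []
    for combo in product(*cache):                           # block 142
        word = "".join(ch for ch, _ in combo)
        if word in dictionary:
            total = sum(score for _, score in combo)        # block 144
            scored.append((total, word))
    scored.sort()                                           # block 146
    return [word for _, word in scored[:n]]                 # block 148


cache = [
    [("h", 1.0), ("b", 2.0)],
    [("u", 1.0), ("a", 1.5), ("i", 3.0)],
    [("n", 0.5)],
    [("d", 0.5)],
]
print(stroke_match(cache, {"hand", "band", "bind"}, n=2))  # ['hand', 'band']
```

Enumerating every combination grows exponentially with word length, so a practical implementation would presumably prune against the dictionary while extending partial words; the exhaustive form is used here only for clarity.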

Refer now to FIG. 6, which is a detailed flow chart of the
spell-aid block 36 of FIGS. 2 and 3. The flow chart is started
at block 150, and at block 152, the "word" is passed to a word
processor spell checker to obtain the first six words which
most resemble the "word", and these are returned to the list
at block 154.
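
The spell-aid step might be sketched as follows, by way of illustration only. The word processor spell checker of the disclosure is not identified, so the sketch substitutes the `get_close_matches` function of the Python standard library module `difflib` as a stand-in; the similarity cutoff of 0.6 is that function's default and is not taken from the disclosure. The function name `spell_aid` is hypothetical.

```python
import difflib


def spell_aid(word, dictionary_words, max_words=6):
    """Return up to max_words dictionary words that most resemble `word`."""
    return difflib.get_close_matches(
        word.lower(), dictionary_words, n=max_words, cutoff=0.6
    )


suggestions = spell_aid("hund", ["hand", "hound", "fund", "zebra"])
print(suggestions)
```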

INDUSTRIAL APPLICABILITY 

It is an object of the invention to provide an improved
handwriting recognition system.

It is an object of the invention to provide an 
improved handwriting recognition system utilizing a 
dictionary based post-processing technique. 
Claims

1. A method of using a dictionary for on-line hand-writing
recognition, said method comprising the steps of:
providing a candidate word for recognition, where said
candidate word is made up of a sequence of at least one
character which is made up of a sequence of at least one
stroke;
determining if the sequence of characters in said candidate
word matches a word in the dictionary with the same spelling,
and if so, providing the candidate word as an output word; and
if not
calculating a recognition score for each character in said
candidate word;
determining if the worst character score in said recognition
score for each character is better than a predetermined
threshold, and if so, providing said candidate word as an
output word; and if not
finding all combinations of strokes that produce a
recognizable character to be used in place of the character
with the worst character score;
assigning scores to each of the recognizable characters;
replacing the character with the worst character score in said
candidate word with the recognizable character with the
highest assigned score to produce a new candidate word; and
providing said new candidate word as an output word.

2. The method of claim 1, including the step of:
removing punctuation from said candidate word.

3. The method of claim 1 or 2, including the step of:
inserting the punctuation removed from said candidate word in
said provided candidate word.

4. The method of claim 1, 2 or 3, including the step of:
inserting the punctuation removed from said candidate word in
the new candidate word.
EP 0 564 827 A2

[Drawings. Text recovered from the flow-chart figures:]
FIG. 3B: precedence of scores (1. word length, 2. substitution,
3. relative character-match); reattach punctuation (46); exit
(48).
FIG. 4 (FIGS. 4A-4F): detailed flow chart of the post-processor,
blocks 60-138, as set out in the description.
FIG. 5: stroke match (34): take the top scoring stroke
hypotheses from the stroke matcher and find all combinations
of strokes which make valid words in the dictionary; add all
the scores for the strokes in each word hypothesis; make a
list of 'N' hypotheses with the best total score; return list.
FIG. 6: spell aid (36): pass 'WORD' to a word-processor spell
checker and obtain the first 'M' words which best resemble
'WORD'; return list (154).